33 research outputs found

    Robust Head-Pose Estimation Based on Partially-Latent Mixture of Linear Regressions

    Head-pose estimation has many applications, such as social event analysis, human-robot and human-computer interaction, driving assistance, and so forth. Head-pose estimation is challenging because it must cope with changing illumination conditions, variability in face orientation and appearance, partial occlusions of facial landmarks, and bounding-box-to-face alignment errors. We propose to use a mixture of linear regressions with partially-latent output. This regression method learns to map high-dimensional feature vectors (extracted from bounding boxes of faces) onto the joint space of head-pose angles and bounding-box shifts, such that they are robustly predicted in the presence of unobservable phenomena. We describe in detail the mapping method, which combines the merits of unsupervised manifold learning techniques and of mixtures of regressions. We validate our method on three publicly available datasets and thoroughly benchmark four variants of the proposed algorithm against several state-of-the-art head-pose estimation methods.
    Comment: 12 pages, 5 figures, 3 tables

    Mapping Sounds on Images Using Binaural Spectrograms

    We propose a novel method for mapping sound spectrograms onto images, thus enabling alignment between auditory and visual features for subsequent multimodal processing. We suggest a supervised learning approach to this audio-visual fusion problem, on the following grounds. Firstly, we use a Gaussian mixture of locally-linear regressions to learn a mapping from image locations to binaural spectrograms. Secondly, we derive a closed-form expression for the conditional posterior probability of an image location, given both an observed spectrogram, emitted from an unknown source direction, and the previously learnt mapping parameters. Notably, the proposed method is able to deal with completely different spectrograms for training and for alignment. While fixed-length wide-spectrum sounds are used for learning, thus fully and robustly estimating the regression, variable-length sparse-spectrum sounds, e.g., speech, are used for alignment. The proposed method successfully extracts the image location of speech utterances in realistic reverberant-room scenarios.

    Variational Inference and Learning of Piecewise-linear Dynamical Systems

    Modeling the temporal behavior of data is of paramount importance in many scientific and engineering fields. Baseline methods assume that both the dynamic and observation equations follow linear-Gaussian models. However, there are many real-world processes that cannot be characterized by a single linear behavior. Alternatively, it is possible to consider a piecewise-linear model which, combined with a switching mechanism, is well suited when several modes of behavior are needed. Nevertheless, switching dynamical systems are intractable because their computational complexity increases exponentially with time. In this paper, we propose a variational approximation of piecewise-linear dynamical systems. We provide full details of the derivation of two variational expectation-maximization algorithms, a filter and a smoother. We show that the model parameters can be split into two sets, static and dynamic parameters, and that the former can be estimated off-line together with the number of linear modes, i.e., the number of states of the switching variable. We apply the proposed method to a visual tracking problem, namely head-pose tracking, and we thoroughly compare our algorithms with several state-of-the-art trackers.

    A Distributed Architecture for Interacting with NAO

    One of the main applications of the humanoid robot NAO, a small robot companion, is human-robot interaction (HRI). NAO is particularly well suited for HRI applications because of its design, hardware specifications, programming capabilities, and affordable cost. Indeed, NAO can stand up, walk, wander, dance, play soccer, sit down, recognize and grasp simple objects, detect and identify people, localize sounds, understand some spoken words, engage in simple and goal-directed dialogs, and synthesize speech. This is made possible by the robot's 24-degree-of-freedom articulated structure (body, legs, feet, arms, hands, head, etc.), motors, cameras, microphones, etc., as well as by its on-board computing hardware and embedded software, e.g., robot motion control. Nevertheless, the current NAO configuration has two drawbacks that restrict the complexity of interactive behaviors that could potentially be implemented. Firstly, the on-board computing resources are inherently limited, which makes it difficult to implement the sophisticated computer vision and audio signal analysis algorithms required by advanced interactive tasks. Secondly, programming new robot functionalities currently requires developing embedded software, which is a difficult task in its own right and necessitates specialized knowledge. The vast majority of HRI practitioners may not have this kind of expertise, and hence they cannot easily and quickly implement their ideas, carry out thorough experimental validations, and design proof-of-concept demonstrators. We have developed a distributed software architecture that attempts to overcome these two limitations. Broadly speaking, NAO's on-board computing resources are augmented with external computing resources. The latter is a computer platform with its CPUs, GPUs, memory, operating system, libraries, software packages, internet access, etc. This configuration enables easy and fast development in Matlab, C, C++, or Python.

    Head Pose Estimation via Probabilistic High-Dimensional Regression

    This paper addresses the problem of head pose estimation with three degrees of freedom (pitch, yaw, roll) from a single image. Pose estimation is formulated as a high-dimensional to low-dimensional mixture-of-linear-regressions problem. We propose a method that maps HOG-based descriptors, extracted from face bounding boxes, to corresponding head poses. To account for errors in the observed bounding-box position, we learn regression parameters such that a HOG descriptor is mapped onto the union of a head pose and an offset, where the offset optimally shifts the bounding box towards the actual position of the face in the image. The performance of the proposed method is assessed on publicly available datasets. Our experiments show that a relatively small number of locally-linear regression functions is sufficient to deal with the non-linear mapping problem at hand. Comparisons with state-of-the-art methods show that our method outperforms several other techniques.

    The nuclear and organellar tRNA-derived RNA fragment population in Arabidopsis thaliana is highly dynamic

    In the expanding repertoire of small noncoding RNAs (ncRNAs), tRNA-derived RNA fragments (tRFs) have been identified in all domains of life. Their existence in plants has already been shown, but no detailed analysis has been performed. Here, short tRFs of 19-26 nucleotides were retrieved from Arabidopsis thaliana small RNA libraries obtained from various tissues, from plants subjected to abiotic stress, or from fractions immunoprecipitated with ARGONAUTE 1 (AGO1). Large differences in the tRF populations of each extract were observed. Depending on the tRNA, either tRF-5D (due to a cleavage in the D region) or tRF-3T (via a cleavage in the T region) were found, and hot spots of tRNA cleavage were identified. Interestingly, up to 25% of the tRFs originate from plastid tRNAs, and we provide evidence that mitochondrial tRNAs can also be a source of tRFs. Very specific tRF-5D deriving not only from nucleus-encoded but also from plastid-encoded tRNAs are strongly enriched in AGO1 immunoprecipitates. We demonstrate that the organellar tRFs are not found within chloroplasts or mitochondria but rather accumulate outside the organelles. These observations suggest that some organellar tRFs could have regulatory functions within the plant cell and may be part of a signaling pathway.
    Cognat, Valerie; Morelle, Geoffrey; Megel, Cyrille; Lalande, Stephanie; Molinier, Jean; Vincent, Timothee; Small, Ian; Duchene, Anne-Marie; Marechal-Drouard, Laurence. Nucleic Acids Res. 2017 Apr 7;45(6):3460-3472. doi: 10.1093/nar/gkw1122. PMC538970